
 \paragraph{Kinds of Languages}
  
  
The number of languages in the world is estimated at around
4000--6000 \cite[ch.\ 1]{song01typology}. Despite the immense diversity in their structure and characteristics, it is striking how similar languages are in the principles
underlying their organization.  {\em Linguistic typology} (henceforth {\em typology}) is a field of research that systematically
studies cross-linguistic variation and seeks to find structures and properties that are common
to all human languages   \cite{comrie89typology,croft90typology}.
%In typology,  cross-linguistic comparability is approached from a  {\em functional} point of view, that is, we assume  natural language is primarily   a communicative device \cite{Mairal06universals}, and that   languages employ different language-specific {\em forms} in order to express a set of similar  language-independent {\em functions}.   

Let us take the well-known linguistic notion of transitivity as an example. In all languages without exception, a transitive sentence is a sentence that describes an event in which an activity is carried over from an agent-like participant to a patient-like participant. ``John loves Mary'', ``Cinderella ate an apple'' and ``IBM bought Lotus'' are transitive sentences. In a transitive sentence we expect to find, at least, elements that correspond to the activity, to the agent-like participant, and to the patient-like participant. In linguistics, these elements are referred to as the grammatical relations {\em predicate}, {\em subject} and {\em object} of the sentence, respectively, and the set of grammatical relations {\em predicate, subject, object} is referred to as the {\em predicate-argument structure} of the sentence.


The grammatical relations {\em subject of} and {\em object of}, which indicate ``who did what to whom'' in a simple sentence, are taken to be primitive notions of any syntactic representation \cite{postal77gr}. These grammatical relations may nonetheless be realized through a variety of forms in different languages, ranging from the organization of words in the sentence to variation in the shape of individual words. In linguistics, the part of the grammar that is in charge of the organization of words in the sentence is called {\em syntax}, and the part of the grammar that is in charge of the formation of words in the lexicon is called {\em morphology}.
An important part of typological research is the study of the mapping from grammatical relations in the sentence to the forms in which they are realized in different languages, which often involves a mix of syntactic and morphological criteria.



 \subsubsection{Morphological Typology}\label{sec:morph}
 
% Modern typologists view the morphological typology as one dimension of variation,  alongside the word-order typology  we surveyed and many others.
%Morphology studies word-formation processes for the purpose of the systematic description of the form of words. 
A long-standing tradition  classifies languages into types with respect to their {\em morphology}, the level of linguistic description that is concerned with the complexity of word-formation processes and  the surface forms of words. 


Modern typologists \cite{sapir21language,comrie89typology} view the morphological classification of languages as the result of the interaction of several parameters. The {\em morphological synthesis} parameter characterizes languages according to their morpheme-per-word ratio, and it is along this dimension that the distinction between {\em isolating} and {\em (poly)synthetic} languages is drawn.


% \cite{comrie89typology}. 
%Linguists at the early nineteenth century believed that  morphological types  can describe the grammatical behavior of a language as a whole.  %
Classical {\em morphological typology} assigns  languages
to one of the following four ideal types: {\em isolating},
{\em agglutinative}, \emph{fusional}\footnote{Also known as {\em (in)flectional} \cite{comrie89typology}, but I refrain from using this term to avoid confusion with inflections in agglutinative languages.} and {\em incorporating} (or {\em polysynthetic}) languages. These types reflect
%Morphological typology} which classifies the
correspondence patterns between properties of words and surface formatives, also known as  {\em  morphemes}, the smallest units of sound-meaning correspondence in the language.
% \cite{bloomfield33language}.
%which are defined as follows.
%%
An {\em isolating} language is a language in which there is a one-to-one correspondence between words and morphemes, e.g.,  
% i.e., the smallest units of sound-meaning correspondence in the
%language. 
%A prototypical isolating language is 
Vietnamese.
 
\begin{exe}
\ex Vietnamese \cite[p.\ 43]{comrie89typology}
 \begin{xlist}
\ex
  {\em Khi toi den nha ban toi, chung toi bat dau lam bai}\\
when I come house friend I {PL} I begin do lesson \\
`When I came to my friend's house we began to do lessons' 
\end{xlist}
\end{exe} 


{\em Incorporating} or {\em polysynthetic} languages are languages which allow for the incorporation of multiple (lexical or grammatical) morphemes to form a single word. Incorporation is a special case of polysynthesis in which only {\em lexical} morphemes (`radicals', as opposed to function morphemes) may be combined. The Eskimo language Yup'ik is known to be a polysynthetic language.


\begin{exe}
\ex Central Alaskan Yup'ik  \cite[ex.\ (1) ]{mithun07roots}
 \begin{xlist}
\ex
  {\em micuumiiteqapiartua}\\
{\em mit`e -yuumiite-qapiar -tu-a} \\
alight -not.want-really -{IND.INTR.MOOD-1SG} \\
`I really don't want to land'
%when I come house friend I PLURAL I begin do lesson \\
%When I came to my friend's house we began to do lessons 
\end{xlist}
\end{exe} 
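
The morpheme-per-word ratio behind the synthesis parameter can be made concrete with a simple index. The following is only a sketch: the symbols $M$ (number of morphemes) and $W$ (number of words) are notational assumptions introduced here, and the morpheme count for the Yup'ik word is read off the segmentation line of the example above (five segments in one word).
\[
\mbox{synthesis} = \frac{M}{W}, \qquad
\mbox{ideal isolating word: } \frac{1}{1} = 1, \qquad
\mbox{{\em micuumiiteqapiartua}: } \frac{5}{1} = 5 .
\]
On this view, isolating and polysynthetic languages occupy the low and high ends of the same scale, respectively.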


The {\em fusion} parameter captures the extent to which it is possible to recognize the boundaries between different morphemes, and it is the dimension along which the distinction between {\em agglutinative} and {\em fusional} languages is drawn. (Poly)synthetic languages can be either agglutinative (e.g., Chukchi) or fusional (e.g., Eskimo) \cite{comrie89typology}.


An {\em agglutinative} (or {\em agglutinating}) language is a language in which
multiple morphemes may combine to form a word, and the
boundaries between the combined morphs are clear. We illustrate this with a fragment of the Turkish morphological paradigm of
the noun {\em adam} (`man'), where the morphemes corresponding to the properties ``{PL}[ural]'' and ``Genitive'' are simply concatenated onto
the stem.

  
\begin{exe}
\ex Turkish, adapted from   \cite[p.\ 44]{comrie89typology}
 \begin{xlist}
\ex
  {\em adam-lar-in}\\
man-{PL}-Genitive \\
`of men'
%when I come house friend I PLURAL I begin do lesson \\
%When I came to my friend's house we began to do lessons 
\end{xlist}
\end{exe} 

A {\em fusional} language is likewise a language in which multiple
morphemes can combine to form a word, but the boundaries between the different morphs are hard or impossible to establish. Latin
illustrates this phenomenon: there are no separable morphs realizing properties such as ``{S}[in]{G}[ular]'', ``{F}[eminine]'' or ``{ACC}[usative]'' in the different forms belonging to a single paradigm.

  
\begin{exe}
\ex Latin  %\cite[p.\ 367]{comrie89typology}
 \begin{xlist}
\ex
  {\em Puell-am bell-am amo}\\
girl-1{SG.F.ACC} beautiful-1{SG.F.ACC} love-1{SG.PRS.IND}  \\
`I love the beautiful girl'
%when I come house friend I PLURAL I begin do lesson \\
%When I came to my friend's house we began to do lessons 
\end{xlist}
\end{exe} 
 

The sets of languages that correspond to these ideal types turn out not to be mutually exclusive. A {\em polysynthetic} language, for example, may be of the {\em agglutinative} type if the way multiple morphemes combine to form a single word is transparent, or it may be highly fusional.

\subsubsection{Word-Order Typology}



% \begin{quote} ``[As indicated by the title,] attention has been concentrated largely, but by no means exclusively, on questions concerning morpheme and word order. The reason for this choice was that previous experience suggested a considerable measure of orderliness in this particular aspect of grammar.''  \cite{greenebrg63order}
%\end{quote}
 
%\draftnote{Looks to me like this first sentence tries to do about three things all at once. Rewrite.}

Another important example of cross-linguistic diversity is the classification of languages according to their {\em basic word order} \cite{greenberg63order}.   %and one of the most prominent research areas in linguistic typology is the
%that is, the {\em basic word-order typology}.
 %The {\em basic word-order typology}  is  one of the most prominent research areas in linguistic typology,  initiated by the work of Joseph Greenberg 
The  observation is that languages show radical differences in the order in which the linguistic elements {\em predicate} (V), {\em subject} (S) and {\em object} (O) are positioned in a  transitive sentence.
The {\em basic word-order} of a language is defined to be the order of the grammatical elements representing {\em V,} {\em S}  and {\em O} in a transitive, pragmatically neutral, unmarked
sentence \cite[chapter 1]{song01typology}. 
Remarkably, all six logically possible permutations are attested in natural languages (\ref{ex:order}).

\begin{exe}
\ex\cite[chapter 1, examples (1)--(6)]{song01typology}
\label{ex:order}\begin{xlist}
\ex Korean {\em (SOV)}
\gll kiho-ka saca-lil cha-ass-ta\\
Keeho-{NOM} lion-{ACC} kick-{PST-IND}\\
\trans{``Keeho kicked the/a lion''}
\ex Thai  {\em (SVO)}
\gll khon n\'{i}i k\`{a}t maa tua n\'{a}n\\
man this bite dog {CL} that\\
\trans{``This man bit that dog''}
\ex Welsh {\em (VSO)}
\gll Lladdodd draig ddyn\\
killed dragon man\\
\trans{``A dragon killed a man''}
\ex Malagasy {\em (VOS)}
\gll nanasa ni lamba ny vehivavy\\
wash the clothes the woman\\
\trans{``The woman is washing the clothes''}
\ex Panare {\em (OVS)}
\gll pi' kokamp\"{o} unki' \\
child washes woman \\
\trans{``The woman washes the child''}
\ex Nad\"{e}b {\em (OSV)}
\gll sam\={u}\={u}y yi qa-w\`{u}h \\
howler-monkey people eat \\
 \trans{``People eat howler-monkeys''}
\end{xlist}
\end{exe}
 
Greenberg \cite{greenberg63order} attempted to set up a typology describing the distribution of word-order patterns across languages. To this end, he gathered a sample of about 30 languages covering a variety of language families and geographical areas, and classified the languages into types reflecting their basic word-order pattern. Based on evidence from his sample, he observed that the {\em VSO, SVO} and {\em SOV} types are empirically dominant, whereas languages in which {\em O} precedes {\em S} are exceedingly rare \cite[universal 1]{greenberg63order}. Greenberg also investigated word-order patterns within non-clausal categories, capturing the relative positions of, e.g., adpositions and nouns, nouns and adjectives, nouns and genitives, and so on. The order of nouns, adjectives and adpositions in conjunction with the three basic word-order types Greenberg identified gives rise to twelve logical co-occurrence possibilities, out of which only seven are attested in Greenberg's sample. All in all, Greenberg \cite{greenberg63order} articulated as many as 45 universal statements concerning the order of meaningful elements in different languages.
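
The count of twelve logical possibilities can be reconstructed as a simple product. This is a sketch under the assumption that each of the two non-clausal parameters is binary (prepositions vs.\ postpositions, and noun--adjective vs.\ adjective--noun order); the labels $\mathit{Pr}$, $\mathit{Po}$, $\mathit{NA}$ and $\mathit{AN}$ are shorthand introduced here for illustration only.
\[
|\{\mathit{VSO}, \mathit{SVO}, \mathit{SOV}\}| \times |\{\mathit{Pr}, \mathit{Po}\}| \times |\{\mathit{NA}, \mathit{AN}\}| = 3 \times 2 \times 2 = 12 .
\]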
  
%it turns out that not all of the logical possibilities are attested in Greenberg's sample. Based on  these data Greenberg formulates  additional generalizations, such as the observation that  for languages with prepositions the genitive almost always follows the noun \cite[universal 2]{greenberg63order}.



%The systematic patterns emerging from these statements  encouraged Greenberg and his followers to try and find a \draftreplace{single, or a handful of, general principle(s)}{single general principle, or a handful of principles,} from which multiple universal patterns can be derived and according to which they can be explained  \cite{lehmann73structural, vennemann74order,hawkins83order}. 
%The identification of such principles 
%is  hoped to reveal the order underlying the organization of
%natural language grammars and 
 %cannot be over-stated ---  these are precisely the principles put
% to put limits on the space of {\em possible} human language,
% in human
%language, 
%which is the next  step  beyond
%classification \cite[p.\ 44]{croft}.
%Greenberg  himself attempted to explain his order universals as resulting from theinteraction of  dominant orders and harmony principles,  favoring the alignment of recessive elements with dominant ones. 
%Intuitively speaking, 
%Dominant orders are those appearing 
% at the implicatum of universals and
% (cf.\ the `independent'
%parameters), the other ones being `recessive'. 
%The interaction between the
% two principle serves to explain  the empirical distribution of language types  and a large number of individual universals in one stroke.
%
%Lehmann \cite{lehmann73structural} replaces Greenberg's verb-based typology with a bipartite {\em VO-OV} typology suggesting  that  the order of modifying-modified elements  is firmly determined by the uninterrupted sequence of the verbal and nominal  elements in the clause (his  {\em Fundamental Principle of Placement (FPP)})  \cite[p.\ 56]{song01typology}. 
%%
%Vennemann \cite{vennemann74order} \draftreplace{pertains to}{sticks to} the {\em VO-OV} typology of Lehmann \cite{lehmann73structural} but   articulatesthe idea that the order of {\em operators} (i.e.,  dependents, modifiers) and {\em operands} (i.e.,   heads, modified) tends to be realized in one direction; {\em  operator-operand} in {\em OV} languages, and {\em operand-operator} in {\em VO} languages  (his  {\em Principle of Natural Serialization (PNS)}). 
%The status of operators and operands had been
%called into question as their definition  tends to be theory
%dependent, and  in any event, the
%the context of Greenbergs
%The empirical evidence for the PNS predictions howeve  was limited; many  languages in Greenberg's sample deviate from them.
  
%Hawkins \cite{hawkins83order}  acknowledged the existence of counterexamples and inconsistencies  in his  extended sample of 300 languages and worked towards  integrating  inconsistencies back into  the language universals system. He did so by sharpening the theoretical tools and independently motivating their means of explanation. For instance, he used   cognitively motivated principles such as the interaction of {\em heaviness} and {\em mobility} constraints.  He also   suggested to study {\em  distributional} typology, and quantified the deviation from a consistent operator-operand serialization patterns using his  {\em Principle of Cross-Category  Harmony}  \cite[p. 75-76]{song01typology}.



Mithun \cite{mithun92order} challenges the view that basic word-order is a universal property altogether and shows that for some Australian languages, none of the syntactic criteria for determining basic word order can be faithfully applied. In such languages, the order of elements in the sentence is determined on pragmatic, rhetorical and/or stylistic grounds. Such languages, in which word-order is pragmatically, rather than syntactically, determined, are called {\em free word-order} languages; a canonical example of such a language is Warlpiri \cite{hale83warlpiri}.
  
Generative grammarians further introduce the notion of {\em scrambling} to refer to similar, pragmatically-driven word-order variation in languages for which a canonical word-order pattern is defined \cite{ross67phd}. Scrambling languages are classified into word-order types, but various nominals are seen to freely `move' within and across certain regions of the sentence. This happens, for instance, in German, where the canonical word-order pattern in main clauses is SVO as it is in English, but considerable freedom is evident in the positioning of nominals in sentence-initial position and in the {\em Mittelfeld}.

The existence of free word-order languages and `scrambling' languages makes it hard to classify languages into ideal types. This gives rise to a view of word order in terms of tendencies, rather than as a clear-cut classification. Languages can be seen as forming a continuum, as in Figure~\ref{fig:wo}, which reflects their word-order tendencies. As the order of elements realizing grammatical relations becomes freer and less systematic, it becomes essential that this information is provided by other components of the grammar, for instance, the morphological form of words.


\begin{figure}
\begin{center}`SVO' ------------------------------------------------ `Free'   \\
Chinese \(< \) English  \(<\)  German \(<\) \ldots \(<\) Warlpiri
%VSO \(<\) SVO \(<\ldots\ldots<\) SOV
\end{center}
\caption{An alternative, graded, representation of word-order types}\label{fig:wo}
\end{figure}
       

%The distinction between {\em synthetic} and {\em polysynthetic} languages, based on their morpheme-to-word ratio, is a matter of degree according to Sapir \cite{sapir21language}, with a continuum spanning from isolating languages on the one extreme to polysynthetic languages on the other, as in figure~\ref{fig:synthesis}.

%Fusion is also more appropriately seen as a scalar classification rather than classifying into pure types, and there exist many languages  which are not easily classifiable into ideal types, as illustrated in figure~\ref{fig:fusion}.
%

%This graded classification along multiple dimensions allows for a large space of morphological types to be combined with  different word-order patterns as we observed in the previous section, which results in  the high variation in realization patterns that typologists observe across languages.


\paragraph{Kinds of Morphemes}

Morphosyntactic processes:
\begin{itemize}
\item Case marking
\item Agreement
\item Construct state
\item Clitics as phrase-level morphology
\end{itemize}

\paragraph{Kinds of Models}

 
